Abstract

The energy measurement of jets produced by b-quarks at hadron colliders suffers
from biases due to the peculiarities of the hadronization and decay of the
originating B hadron. The impact of these effects can be estimated by
reconstructing the mass of Z boson decays into pairs of b-quark jets. From a
sample of 584 pb^-1 of data collected by the CDF experiment in 1.96 TeV
proton-antiproton collisions at the Tevatron collider, we show how the Z signal
can be identified and measured. Using the reconstructed mass of Z candidates we
determine a jet energy scale factor for b-quark jets with a precision better
than 2%. This measurement allows a reduction of one of the dominant sources of
uncertainty in analyses based on high transverse momentum b-quark jets. We also
determine, as a cross-check of our analysis, the Z boson cross section in
hadronic collisions using the bb final state as
σ_Z × B(Z → bb) = 1578 (+…/−…) pb.

1 Introduction



Since their discovery in 1983 [1], W and Z bosons have been studied at hadron
colliders predominantly using their leptonic decays. In fact, the hadronic
decays of these particles are generally so difficult to separate from the large
background arising from generic jet pairs produced by quantum chromodynamics
(QCD) interactions that, after the extraction of a mass bump in the dijet mass
distribution by the UA2 collaboration from 630 GeV data at the SppS in 1987 [2],
little more has emerged. At the Fermilab Tevatron the direct observation of
hadronic decays of vector bosons is more difficult. With respect to the SppS,
the Tevatron's higher center-of-mass energy is a disadvantage because, although
the signal cross section is four times larger, the irreducible background from
QCD processes yielding jet pairs increases by over an order of magnitude.

During Run I at the Tevatron (1992-96) hadronic W decays were successfully
used by the CDF and D0 experiments in the discovery and measurement of the top
quark, both in the single lepton and fully hadronic final states. The larger
Run II data sample (data taking started in 2001) made it possible to exploit
the hadronic decay of W bosons in top events for a direct calibration of the
energy measurement of light-quark jets in the reconstruction of the tt decay.
That technique provides a significant reduction of the systematic uncertainty
arising from the jet energy measurement [3]. Tevatron experiments can indeed
reach an accuracy close to 1 GeV/c^2 on the top quark mass in Run II by
reducing to about 1% the uncertainty on the jet energy scale (JES), a factor
which measures the discrepancy between the effect of detector response and
energy corrections in real and simulated hadronic jets.

For the Z boson, which is not produced in top decays, the observation in
hadronic final states is more complicated. Only the decay to identified b-quark
pairs is observable, because the enormous gg → gg background is then
significantly reduced. Indeed, a small signal of Z → bb decays was extracted by
CDF in Run I data exploiting the semileptonic decay of b quarks by triggering
on muons with low transverse momentum [3]. The signal was too small to
extract information on the accuracy of the b-jet energy measurement, but it
spurred the development of a dedicated trigger in Run II. A large signal of
Z → bb decays free from selection biases allows a precise measurement of the
energy scale of b-quark jets and provides a determination of the b-jet energy
resolution. The reduction of the uncertainty in the b-jet energy scale (b-JES),
which measures the ratio between the calorimeter response in real and
simulated b-quark jets, can help all precision measurements of the top quark
mass, while a determination of the b-jet energy resolution is important for the
search for a low-mass Higgs boson. The signal allows a direct tuning of
algorithms that seek to increase the resolution of the b-jet energy
measurement. These algorithms are a critical ingredient for the potential
observation of the Higgs boson at the Tevatron if M_H < 135 GeV/c^2.



The largest contribution to the total uncertainty in the top quark mass
determination at the Tevatron originates from the knowledge of the jet energy
scale. The JES can be determined from studies of detector response to single
particles and a detailed modeling of their spectrum in jets [5]. An estimate
of the accuracy of the resulting calibration can then be obtained with events
containing a single photon recoiling against a hadronic jet, but the method
is limited by systematic uncertainties arising from the modeling of the
production process. A check of the light-quark calibration, which is essentially
statistics-limited, comes instead from the measurement of W → qq' decays in
tt events, as mentioned earlier.



When dealing with b-quark jets one has to cope with their unique fragmentation
and decay properties, which result in uncertainties in the jet energy
response. The B hadron resulting from b-quark fragmentation carries a larger
fraction of the parent quark momentum than hadrons originating from light
quark fragmentation. Uncertainties in the jet energy response arise from the
imperfect knowledge of the fragmentation properties of b-quarks. Moreover,
b-jets have a different response on average than light-quark and gluon jets
because of the large semileptonic decay fraction of B hadrons. The imperfect
knowledge of the decay properties of b-quarks yields an additional uncertainty
on the jet energy response. Those effects have to be accurately modeled if one
is to apply a generic JES factor, derived from light-quark and gluon jets, to
the two b-jets emitted in tt decays.



Due to the small cross section of production processes yielding events with a
high-energy photon recoiling against a b-quark jet, a measurement of the b-JES
with transverse balancing techniques [5] is quite difficult, and the Tevatron
experiments have so far been unable to exploit them. In this paper, however, we
demonstrate the feasibility of extracting the b-JES from the reconstructed
Z → bb signal. We first discuss in detail the data sample we use and its
collection and reconstruction; then we describe the method by which we model
the spectrum of the huge background from QCD events and our fitting technique.
We then evaluate the systematic uncertainties affecting the signal extraction.
Finally, we present our measurement of the b-JES factor, which achieves an
uncertainty better than 2% using an integrated luminosity of 584 pb^-1 of CDF
Run II data, and we provide for the first time, as a cross-check of our
analysis, a measurement of the Z boson cross section from its bb final state.

2 The detector and the datasets



In this section we provide a short description of the experimental facility and
discuss the method by which the sample of data used for our analysis is
collected. We also describe the software reconstruction of the event
characteristics that are most relevant to the measurement of Z boson decays to
b-quark pairs.



2.1 The CDF detector 



CDF is a magnetic spectrometer designed to detect and measure, over a wide
range of rapidity, charged and neutral particles produced by 1.96 TeV
proton-antiproton collisions delivered by the Tevatron collider. The detector
is described in detail elsewhere [6]; in this section we only describe those
components most relevant to the Z → bb measurement.

A seven-layer silicon vertex detector (SVX) is located immediately outside
of the beam pipe. Its microstrip sensors measure with a precision of about
ten micrometers the position where charged particles cross each layer; that
information allows the discrimination of tracks originating from the decay of
long-lived particles, as explained in Sec. 2.4. Outside the silicon detector, a
large cylindrical multilayer drift chamber, the Central Outer Tracker (COT),
measures track momenta within the pseudorapidity [7] interval |η| < 1.0 from
the curvature of their helices in the 1.4 T axial magnetic field.
Electromagnetic and hadronic sampling calorimeters, arranged in a
projective-tower geometry, surround the tracking systems and measure the energy
and direction of electrons, photons, and jets in the range |η| < 3.6. Muon
systems outside the calorimeters allow the reconstruction of track segments for
penetrating particles within |η| < 1.5. The beam luminosity is determined using
gas Cherenkov counters surrounding the beam pipe, which measure the average
number of inelastic pp collisions per bunch crossing.

Data acquisition is initiated by a three-level trigger system [5]. Level 1 and
level 2 consist of dedicated hardware modules, while level 3 runs
speed-optimized reconstruction algorithms on a farm of commercial processors.
Of particular relevance to our analysis are the hardware devices reconstructing
charged particle tracks in the level 1 (XFT) and level 2 (SVT) trigger systems.
XFT [9], the extremely fast tracker, uses hits in the COT and a fast hardware
architecture to measure transverse momentum and azimuthal angle of charged
tracks. SVT [10], the Silicon Vertex Trigger, is a highly parallel system which
allows the measurement of the impact parameter of tracks using SVX information,
with precision (35 μm for 2 GeV/c tracks) similar to that obtained offline.

2.2 The Z → bb trigger



A trigger (called Z_BB), selecting events with low transverse energy jet pairs
and containing tracks with significant impact parameter, was implemented in
Run II as a means of acquiring a large sample of Z decays to pairs of b-jets.
The trigger design was based on the ability of the SVT to identify and measure
tracks with a significant value of d0, the impact parameter with respect to the
beam position in the transverse plane.

During its operation, the trigger underwent a few small modifications to main- 
tain a manageable trigger rate as the Tevatron instantaneous luminosity in- 
creased. Because even small trigger modifications may affect the shape of the 
background dijet mass distribution in a way which is very difficult to model 
with the necessary accuracy, this measurement focuses on approximately half 
of the total data collected in the years 2001-2006, corresponding to a period 
of data taking when the trigger did not undergo significant changes. 

The version of the Z_BB trigger that collected our data requires two XFT
tracks and a localized calorimeter energy deposit at level 1, two low-Et
calorimeter clusters [6] and two displaced SVT tracks at level 2, and two
reconstructed jets and two large impact parameter tracks at level 3.
Specifically, the following selection is applied to each event:

• level 1 requires one central (|η| < 1.1) calorimeter tower with transverse
energy Et > 5 GeV, plus two XFT tracks with transverse momentum above
5.5 and 2.5 GeV/c, respectively;

• level 2 vetoes events containing a calorimeter cluster of Et > 5 GeV in
the pseudorapidity range 1.1 < |η| < 3.6. Events are required to have two
central Et > 3 GeV clusters plus two SVT tracks with Pt > 2 GeV/c and
impact parameter d0 greater than 160 μm;

• level 3 requires two central jets having Et > 10 GeV. The event must also
contain two tracks with Pt > 2 GeV/c and d0 > 160 μm; alternatively,
track pairs with looser cuts (Pt > 1.5 GeV/c, d0 > 130 μm) are accepted
if the impact parameter is more than three times larger than its estimated
measurement error.
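
The three trigger levels listed above amount to a chain of boolean
requirements. As a minimal sketch, assuming a hypothetical event layout (plain
tuples for towers, clusters, jets, and tracks) and hypothetical function names
that are not part of any CDF software, the selection could be written as
follows. The thresholds are those quoted in the list; treating "central" as
|η| < 1.1 at every level and cutting on the absolute value of d0 are
assumptions of this example.

```python
CENTRAL_ETA = 1.1  # assumed definition of "central" at all trigger levels


def passes_level1(towers, xft_pts):
    """towers: [(et_gev, eta)]; xft_pts: [pt_gev] of XFT tracks."""
    has_tower = any(et > 5.0 and abs(eta) < CENTRAL_ETA for et, eta in towers)
    pts = sorted(xft_pts, reverse=True)
    has_tracks = len(pts) >= 2 and pts[0] > 5.5 and pts[1] > 2.5
    return has_tower and has_tracks


def passes_level2(clusters, svt_tracks):
    """clusters: [(et_gev, eta)]; svt_tracks: [(pt_gev, d0_um)]."""
    forward_veto = any(et > 5.0 and 1.1 < abs(eta) < 3.6 for et, eta in clusters)
    n_central = sum(1 for et, eta in clusters
                    if et > 3.0 and abs(eta) < CENTRAL_ETA)
    n_displaced = sum(1 for pt, d0 in svt_tracks
                      if pt > 2.0 and abs(d0) > 160.0)
    return not forward_veto and n_central >= 2 and n_displaced >= 2


def passes_level3(jets, tracks):
    """jets: [(et_gev, eta)]; tracks: [(pt_gev, d0_um, d0_err_um)]."""
    n_jets = sum(1 for et, eta in jets if et > 10.0 and abs(eta) < CENTRAL_ETA)
    n_tight = sum(1 for pt, d0, _ in tracks if pt > 2.0 and abs(d0) > 160.0)
    # Looser cuts are allowed when d0 exceeds three times its error.
    n_loose = sum(1 for pt, d0, err in tracks
                  if pt > 1.5 and abs(d0) > 130.0 and abs(d0) > 3.0 * err)
    return n_jets >= 2 and (n_tight >= 2 or n_loose >= 2)


def passes_z_bb(event):
    """event: dict holding the per-level inputs under hypothetical keys."""
    return (passes_level1(event["towers"], event["xft_pts"])
            and passes_level2(event["clusters"], event["svt_tracks"])
            and passes_level3(event["jets"], event["tracks"]))
```

Keeping one function per level mirrors the way the requirements are stated and
makes it easy to study the efficiency of each level separately.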

A dynamic prescaling was applied in the level 2 trigger during a portion of 
the period of activity of the above trigger. Dynamic prescaling automatically 
rejects a variable fraction of the data passed by the trigger; the fraction de- 
pends on the instantaneous luminosity of the colliding beams, and it is tuned 
to keep the output rate of the trigger system within the storage capabilities 
of the data acquisition system. The main effect of the prescaling factor is to 
reduce the effective integrated luminosity of the data; for the Z_BB trigger the 
maximum reduction is a factor of 10 at the highest luminosity. 
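
The effect of prescaling on the integrated luminosity can be illustrated with
a short calculation. The sketch below assumes that the delivered luminosity and
the prescale factor are known for each data-taking period; the function name
and the numbers in the usage note are invented for illustration and are not
CDF values.

```python
def effective_luminosity(periods):
    """Sum the delivered luminosity divided by the prescale of each period.

    periods: iterable of (delivered_lumi_pb, prescale_factor) pairs, where a
    prescale factor of N means that one event in N is kept.
    """
    total = 0.0
    for lumi, prescale in periods:
        if prescale < 1:
            raise ValueError("prescale factor must be >= 1")
        total += lumi / prescale
    return total
```

For instance, `effective_luminosity([(100.0, 1), (50.0, 2), (20.0, 10)])`
gives 127 pb^-1 out of 170 pb^-1 delivered.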
After a selection of runs tagged as "good" by the data quality monitor, the
dataset contains a total of 39 million events. This corresponds to an integrated
luminosity L = 584 pb^-1 for the Z_BB trigger and the run range considered
in our analysis. The total uncertainty on the integrated luminosity is 5.9%.

2.3 Monte Carlo samples 

Version 6.216 of the Pythia Monte Carlo (MC) program [11] was used to
generate 7.4 million Z → bb events with minimum-bias interactions superimposed
according to the instantaneous luminosity of the real data. The Z boson sample
was generated using the CTEQ5L [12] parton density function (PDF) set. The
Pythia tune A parameter set [13] was used to model the underlying event.
Simulation of B hadron decays was performed using the EvtGen [14] event
generator. Event reconstruction was performed with the same offline
algorithms used for real data. Trigger requirements were emulated using trigger
primitives simulation (level 1 and level 2) and offline variables (level 3).

In addition to the Z → bb dataset, smaller samples of Z → cc and W → cs
were also generated and reconstructed with the same recipe outlined above, to
study the contamination of such processes in our data sample (see Sec. 4.4).
Moreover, several QCD Pythia Monte Carlo samples were used for additional
studies on sample composition and background modeling. Finally, Z → e+e−
and Z → μ+μ− MC samples, also generated with Pythia, were used for studies
of the systematic uncertainty on signal acceptance due to initial state QCD
radiation. These studies are described in Sec. 4.2.

2.4 Event reconstruction

Hadronic jets are reconstructed from calorimeter tower information using
an iterative jet cone clustering algorithm, JetClu [15], with the cone radius
R = sqrt(Δη^2 + Δφ^2) = 0.7 units in the azimuth-pseudorapidity space. Jet Et
is computed by summing the energy deposited in calorimeter towers within
the cone and multiplying it by sin θ, where θ is the polar angle of the
Et-weighted centroid of the clustered towers.
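
The cone distance and the jet Et defined above can be made concrete with a few
lines of code. This is a single-pass sketch around a fixed cone axis, not the
iterative JetClu algorithm; the tower layout (energy, η, φ) and the function
names are assumptions of the example.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance R = sqrt(d_eta^2 + d_phi^2) in eta-phi space."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def polar_angle(eta):
    """Polar angle theta from pseudorapidity eta = -ln tan(theta / 2)."""
    return 2.0 * math.atan(math.exp(-eta))


def jet_et(towers, axis_eta, axis_phi, radius=0.7):
    """Jet Et from towers (energy_gev, eta, phi) inside a cone around the axis."""
    in_cone = [t for t in towers
               if delta_r(t[1], t[2], axis_eta, axis_phi) < radius]
    # Et of each tower, used as its weight in the centroid.
    weights = [e * math.sin(polar_angle(eta)) for e, eta, _ in in_cone]
    if not in_cone or sum(weights) == 0.0:
        return 0.0
    centroid_eta = sum(w * t[1] for w, t in zip(weights, in_cone)) / sum(weights)
    total_energy = sum(t[0] for t in in_cone)
    return total_energy * math.sin(polar_angle(centroid_eta))
```

A single 50 GeV tower at η = 0 gives Et = 50 GeV, since sin θ = 1 there; towers
farther than R = 0.7 from the axis do not contribute.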

The standard CDF jet energy correction package [5] determines the most prob- 
able energy of the parton that produced the jet by applying to the "raw" jet 
energy (which is labeled level 0) several factors in series, to account for detector 
non-uniformities (level 1), multiple pp interactions (level 4), calorimeter sta- 
bility and non-linear response, fragmentation model and Monte Carlo tuning 
(level 5) and other effects such as energy from the underlying event included 
in the jet cone (level 6) and energy lost out of the clustering cone (level 7). 
Usually, CDF analyses seeking the reconstruction of massive object decays
use jet energies corrected up to level 5 to reconstruct the kinematics of the
final state, and apply separately custom corrections for those effects that are
process-dependent; that is the case, for instance, of top quark mass
measurements. We made the same choice of correction level in our analysis in
order to measure a b-jet energy scale factor that can be applied to other
analyses based on high-Pt b-quark jets (provided that they use the same R = 0.7
jet cone definition). From now on, "corrected" jet energies will imply jet
energies corrected to level 5, while "uncorrected" jet energies will stand for
raw (level 0) measured jet energies.

For the purpose of maximizing the heavy flavor content in the dijet sample,
we use the SecVtx b-tagging algorithm to search for secondary vertices.
Indeed, the long lifetime and high mass of B hadrons allow their decays to be
well displaced from the primary pp interaction point, and thus form a
secondary vertex. The algorithm is described in detail elsewhere [6]. In short,
SecVtx searches for secondary vertices in jets with uncorrected Et > 15 GeV
and pseudorapidity in the range |η| < 2.0. The algorithm first selects charged
tracks within the jet cone that have been measured in the silicon detector
with sufficiently good position accuracy and that have a large significance of
their impact parameter with respect to the interaction point. The algorithm
then uses these tracks in a recursive procedure to reconstruct a common point
of origin for at least three of them. If the reconstruction fails, tighter
requirements are imposed on the tracks, and a fit accepting two-track vertices
is attempted. Reconstructed vertices are rejected if their transverse distance
from the interaction point corresponds to the location of material of the
innermost silicon layer (1.2 cm < r < 1.5 cm) or if it is greater than 2.5 cm.
If a good vertex is found, several quantities are computed with the tracks
belonging to it. Among them, we use the vertex mass (see Sec. 2.5), defined
as the invariant mass of all tracks originating from the secondary vertex; in
the computation of the vertex mass, tracks are assumed to be charged pions
(m_π = 139 MeV/c^2), there being no possible discrimination available between
different particle species.
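
The vertex mass and the material-region veto described above translate directly
into code. The following is a minimal sketch, assuming that each track is given
by its momentum components in GeV/c; the function names are hypothetical and
this is not the SecVtx implementation.

```python
import math

PION_MASS = 0.139  # GeV/c^2, the charged-pion mass assumed for every track


def vertex_mass(tracks):
    """Invariant mass (GeV/c^2) of tracks given as (px, py, pz) in GeV/c."""
    energy = px = py = pz = 0.0
    for tx, ty, tz in tracks:
        energy += math.sqrt(tx * tx + ty * ty + tz * tz + PION_MASS ** 2)
        px += tx
        py += ty
        pz += tz
    mass_sq = energy ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    return math.sqrt(max(mass_sq, 0.0))  # guard against rounding below zero


def vertex_radius_accepted(r_cm):
    """Apply the transverse-distance veto quoted in the text."""
    in_material = 1.2 < r_cm < 1.5  # innermost silicon layer
    too_far = r_cm > 2.5
    return not (in_material or too_far)
```

Two back-to-back tracks of 1 GeV/c each give a vertex mass of about
2.02 GeV/c^2, while a single track returns the pion mass itself.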

A jet is called taggable if it contains at least two tracks that pass all the 
selection criteria applied by the SecVtx algorithm other than the large impact 
parameter requirement. A jet is said to be b-tagged if a secondary vertex with 
good fit quality is found by the SecVtx algorithm and if the angle between the 
jet direction and the vector pointing from primary vertex to the secondary 
vertex is less than π/2. So-called negative tags are those failing the latter
requirement: these are mostly light-quark or gluon jets with a fake secondary 
vertex due to imperfect track resolution. Tagged jets constitute a subclass of 
taggable jets. 
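
The sign convention for tags can be expressed through a dot product, since an
angle below π/2 between the jet direction and the displacement vector is
equivalent to a positive scalar product. The sketch below assumes
three-dimensional Cartesian vectors and a hypothetical function name; it covers
only the sign classification, not the vertex fit.

```python
def tag_sign(jet_direction, primary_vertex, secondary_vertex):
    """Return 'positive' or 'negative' for a jet with a good secondary vertex.

    The tag is positive when the angle between the jet direction and the
    vector from the primary to the secondary vertex is below pi/2, which is
    the same as requiring a positive dot product.
    """
    displacement = [s - p for s, p in zip(secondary_vertex, primary_vertex)]
    dot = sum(j * d for j, d in zip(jet_direction, displacement))
    return "positive" if dot > 0.0 else "negative"
```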

SecVtx has been demonstrated to efficiently tag b-quark jets (40-50% efficiency
for b-jets from top quark decay), with light-quark and gluon fake rates of
less than 1% [6]. For the extraction of a Z boson decay signal, whose cross
section is smaller by more than three orders of magnitude than that of generic
dijet background, it is mandatory to maximize the b-jet content of the data
sample. Consequently both jets are required to be tagged by SecVtx. In the
next section we show that this requirement is still needed for data collected
with an impact parameter trigger.



2.5 Heavy flavor content of the data



The trigger described in Sec. 2.2 produces a dataset containing a significant
fraction of events with two central b-quark jets. However, it is only through the
direct reconstruction of the secondary vertex inside each of the two central jets 
that we can remove most of the light-quark and gluon background, because 
its production cross section is so large that it still dominates the data after 
trigger selection. 

We determine the fraction of events due to bb production using the vertex 
mass, which is sensitive to the flavor of the parton originating the jet: light 
quarks and gluons, which generate a secondary vertex tag only by virtue of 
track mismeasurement, produce vertices with low invariant mass on average; 
charm and bottom quarks have vertices with larger mass, and the latter is 
easily distinguishable from the former (see Fig. 1).

For this study, the selection of clean dijet events was made tighter with respect 
to the initial data to ensure a better understanding of the event topology. 
Events are selected if they contain two jets in the central calorimeter (|η| <
1.0) with uncorrected Et > 20 GeV and if there are no jets with uncorrected
Et > 10 GeV in the |η| > 1.0 calorimeter regions. The data are divided into
subsets based on the trigger settings and running conditions in order to check 
for variations in the sample composition. 

The results show that the mean fraction of events due to direct production of a
central bb pair in our data sample is F_bb = 23 ± 2% in clean dijet events
before SecVtx tagging requirements; F_bb = 46 ± 5% in events with one jet
containing a SecVtx tag; and F_bb = 91 (+…/−…)% in events with both jets SecVtx
tagged. These studies therefore indicate that in order to collect a sample with
high bb purity it is necessary to require secondary vertex tagging of both
central jets.



[Figure 1 plot: vertex mass (GeV/c^2) distributions for single tags, comparing
data with b-quark, c-quark, and light-quark templates.]

Fig. 1. Invariant mass of the charged tracks in the vertex for a sample of jets in sin- 
gle-tagged events (top) and for the tagged jet in double-tagged events (bottom). The 
templates show the relative fraction of b, c, and light quark or gluon jets estimated 
from the sample composition fit. 

3 Signal selection 



In this section we define the dijet system used for the reconstruction of the Z 
mass peak and the kinematic selection cuts we apply to increase the signal- 
to-background ratio. 



3.1 Preliminary cuts and definition of the dijet system



As we discussed in Sec. 2, our initial dataset consists of events that pass a
level 2 trigger requirement of two central calorimeter clusters with a very low
Et threshold, which was set at 3 GeV to affect as little as possible the shape
of the turn-on in the dijet mass distribution. The low threshold is needed
because the Et of soft jets is measured rather poorly by the level 2 trigger
hardware cluster finder [16]. In the level 3 trigger, jets are required to have
Et > 10 GeV using the standard CDF jet cone algorithm. The dijet mass turn-on
is significantly affected by this requirement because the jet energies are
still uncorrected. However, this requirement is mandatory to keep the trigger
rate at an acceptable level. We must carefully model the effect of the trigger
selection on the mass distribution. This is made easier if the region of
maximum variation in the jet reconstruction efficiency, the turn-on region, is
discarded by means of a sharp offline cut.

Fig. 2. Jet Et distributions for the two leading jets in experimental data. Left:
corrected Et of the leading jet for all events (continuous line) and events passing 
the preliminary selection (dashed line). Right: same, for the second jet. 

We studied the raw and corrected Et distributions of the two leading jets
in our data in order to select an initial threshold to define our dijet sample.
The jets we considered have detector pseudorapidity in the range |η_d| < 1.0, a
selection that reflects the cuts applied by the trigger and which limits threshold
effects in the dijet mass distribution.



Given our goal of determining a scale factor for the energy of jets corrected to 
level 5 by means of a fit to the Z → bb signal, the most sensible way to define
the dijet system is to use jet energies corrected to that level. Based on Fig. 2,
we require that both jets have corrected Et > 22 GeV. We thus select events 
unbiased by the rise of the trigger turn-on curve. Below is the step-by-step 
procedure adopted to define the central dijet system used to compute the dijet 
mass. 



(1) Reconstruct jets with the JetClu algorithm using a cone radius R = 0.7; 
correct the energy of jets to level 5; 

(2) order the list of jets by decreasing value of Et; require that the first two
jets have Et > 22 GeV;

(3) select central dijet events by requiring that the two leading jets have
detector pseudorapidity in the range |η_d| < 1.0;

(4) require both leading jets to be defined as taggable by SecVtx. 

Table 1 details the number of events passing these cuts.
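
Steps (2) to (4) of the procedure above, followed by the computation of the
dijet mass, can be sketched as follows. The jet record (a dictionary with
level-5 corrected `et`, `eta`, `phi`, detector pseudorapidity `eta_det`, and a
`taggable` flag) is a hypothetical layout, and the mass formula treats jets as
massless, which is an assumption of this example rather than a statement about
the analysis code.

```python
import math


def select_dijet(jets, et_min=22.0, eta_det_max=1.0):
    """Return the two leading jets if the event passes the cuts, else None."""
    ordered = sorted(jets, key=lambda j: j["et"], reverse=True)
    if len(ordered) < 2:
        return None
    lead = ordered[:2]
    if any(j["et"] <= et_min for j in lead):
        return None
    if any(abs(j["eta_det"]) >= eta_det_max for j in lead):
        return None
    if not all(j["taggable"] for j in lead):
        return None
    return lead[0], lead[1]


def dijet_mass(jet1, jet2):
    """Invariant mass of two massless jets from Et, eta, and phi."""
    d_eta = jet1["eta"] - jet2["eta"]
    d_phi = jet1["phi"] - jet2["phi"]
    mass_sq = 2.0 * jet1["et"] * jet2["et"] * (math.cosh(d_eta) - math.cos(d_phi))
    return math.sqrt(max(mass_sq, 0.0))
```

Two back-to-back jets at η = 0 with Et = 45.6 GeV each give a dijet mass of
91.2 GeV/c^2 in this approximation.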



3.2 Optimization of the kinematical selection 



The selection of Z bosons that decay to b-quark jets and their discrimination
from the huge QCD background is a difficult experimental problem. Background
events differ only slightly from the signal: they feature a higher probability
of gluon radiation from the initial state, a different color configuration of
the final state, and of course a non-resonant structure in the invariant mass.
We studied many kinematic variables which could in principle be sensitive
to these differences. However, very little can be done after selecting a clean
dijet system in which both jets are b-tagged. The most sensitive quantities
providing additional discrimination are the transverse energy of the third jet,
Et^3, and the angle between the leading two jets in the plane transverse to the
beam, Δφ12. A comparison of data and Monte Carlo for these variables is
shown in Fig. 3.




[Figure 3 plots: Δφ12 (rad) and Et of the third jet (GeV); legend: Z → bb MC,
data.]

Fig. 3. Distributions (normalized to unity) of the variables used to optimize the 
kinematical selection, for data and Monte Carlo (Pythia). Left: azimuthal angle 
between the leading jets; right: corrected Et of the third jet. 

To determine the best cuts on these two variables we employ pseudo-experiments,
which allow us to estimate the b-jet energy scale uncertainty we would obtain
from a fit to the dijet mass distribution with the number of events and signal
fraction expected for a given set of kinematic cuts (Δφ12 > x, Et^3 < y, where
Et^3 is the transverse energy of the third jet, x varies from 2.0 to 3.1
radians, and y varies from 1 to 25 GeV).

We construct pseudo-data distributions of the dijet invariant mass from a
data-driven background template (see Sec. 5.1) and a Z → bb MC signal template
corresponding to a b-jet energy scale k = 1.0. For each pseudo-experiment, we
perform a simple two-component χ^2 fit. We use as a background template in
the fit the same distribution used to generate the background mass values in
the pseudo-data distribution, while for the signal we use in turn 21 different
templates generated at varying values of the b-jet energy scale k from 0.9 to
1.1. The width of the resulting χ^2 curve gives us the accuracy of the fit.
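
The scan over signal templates can be sketched with toy shapes. In the example
below the Gaussian signal, the exponential background, the mass range, and all
function names are stand-ins chosen for illustration (the analysis uses a Monte
Carlo signal template and a data-driven background template). The two yields
are left free and solved analytically for each trial k, which is an assumption
about the form of the fit, and the pseudo-data are left unfluctuated for
simplicity, whereas actual pseudo-experiments would fluctuate each bin.

```python
import math


def binned_shape(edges, density):
    """Evaluate a density at bin centres and normalise the result to unity."""
    centres = [0.5 * (lo + hi) for lo, hi in zip(edges[:-1], edges[1:])]
    values = [density(c) for c in centres]
    total = sum(values)
    return [v / total for v in values]


def signal_template(edges, k, peak=91.0, width=12.0):
    """Toy signal: a Gaussian whose position and width scale with k."""
    return binned_shape(
        edges, lambda m: math.exp(-0.5 * ((m - k * peak) / (k * width)) ** 2))


def background_template(edges, slope=40.0):
    """Toy background: a falling exponential."""
    return binned_shape(edges, lambda m: math.exp(-m / slope))


def chi2_two_component(data, errors, sig, bkg):
    """Chi2 minimised analytically over the signal and background yields."""
    w = [1.0 / e ** 2 for e in errors]
    a_ss = sum(wi * s * s for wi, s in zip(w, sig))
    a_sb = sum(wi * s * b for wi, s, b in zip(w, sig, bkg))
    a_bb = sum(wi * b * b for wi, b in zip(w, bkg))
    c_s = sum(wi * s * d for wi, s, d in zip(w, sig, data))
    c_b = sum(wi * b * d for wi, b, d in zip(w, bkg, data))
    det = a_ss * a_bb - a_sb ** 2
    n_sig = (c_s * a_bb - c_b * a_sb) / det
    n_bkg = (c_b * a_ss - c_s * a_sb) / det
    return sum(wi * (d - n_sig * s - n_bkg * b) ** 2
               for wi, d, s, b in zip(w, data, sig, bkg))


def scan_energy_scale(data, errors, edges, bkg, k_values):
    """Return [(k, chi2)] for each trial value of the energy scale."""
    return [(k, chi2_two_component(data, errors, signal_template(edges, k), bkg))
            for k in k_values]


def one_sigma_interval(scan):
    """Range of k where chi2 lies within one unit of its minimum."""
    chi2_min = min(c for _, c in scan)
    inside = [k for k, c in scan if c - chi2_min <= 1.0]
    return min(inside), max(inside)


# Unfluctuated pseudo-data built with k = 1.0; the yields are illustrative.
edges = [40.0 + 5.0 * i for i in range(33)]        # 40 to 200 GeV/c^2
bkg = background_template(edges)
truth = signal_template(edges, 1.0)
data = [4630.0 * s + 262616.0 * b for s, b in zip(truth, bkg)]
errors = [math.sqrt(d) for d in data]
k_values = [0.90 + 0.01 * i for i in range(21)]    # 21 templates, 0.9 to 1.1
scan = scan_energy_scale(data, errors, edges, bkg, k_values)
best_k = min(scan, key=lambda kc: kc[1])[0]
```

With unfluctuated pseudo-data the minimum falls at the generated value of k;
the interval returned by `one_sigma_interval(scan)` plays the role of the
"width of the χ^2 curve" mentioned in the text.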

To first order, the minimum uncertainty in the b-jet energy scale among a set
of fits with varying data statistics and signal to noise (S/N) values should
correspond to the largest value of S/√N. However, we must also consider a
systematic uncertainty on the background shape. A priori studies of the
accuracy of our background modeling performed on background-enriched data
events where only one jet is b-tagged (very low signal fraction: S/N ~ 0.2%)
allow us to predict a shape uncertainty of 1%. For these pseudo-experiments,
therefore, we include the effect of a background uncertainty in the fits by
adding in quadrature an uncertainty equal to 1% of the bin content to the
uncertainty of each bin in the pseudo-data distribution. The effect of an
inflation by 1% in the statistical error is not exactly the same as a systematic
uncertainty in the knowledge of the background shape, but it is a good
first-order guess of the impact that a similar uncertainty has on a fit to the
dijet mass distribution with a given signal fraction. This is used only to make
our a priori choice of the cut values on Et^3, the transverse energy of the
third jet, and Δφ12. Pseudo-experiments show that the minimum uncertainty on
the b-JES can be obtained by selecting the data with cuts at Et^3 < 15 GeV and
Δφ12 > 3.0 radians. Those are the cuts we use to define the region in which we
search for the Z → bb signal.
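
The per-bin error inflation used in the pseudo-experiments amounts to a sum in
quadrature of the statistical uncertainty and 1% of the bin content. A minimal
sketch, assuming Poisson statistics for the bin counts (the function name is
hypothetical):

```python
import math


def inflated_bin_error(content, shape_fraction=0.01):
    """Poisson error combined in quadrature with a fractional shape term."""
    statistical = math.sqrt(content)
    shape = shape_fraction * content
    return math.hypot(statistical, shape)
```

For a bin with 10 000 entries both terms equal 100, so the total uncertainty is
about 141; for more populated bins the 1% term dominates.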

The total data with two central b-tagged jets passing the preliminary selections
and the Δφ12 and Et^3 cuts amounts to 267 246 events. The predicted signal
fraction in this sample is about 1.7%. A more precise estimate of this fraction 
is provided in the next section. 



4 Signal acceptance and related uncertainties 

In this section we estimate the amount of Z → bb signal in the selected data.
That number is an input to the fits we describe in Sec. 5.4 to extract the b-jet
energy scale factor from the data, because a signal normalization constraint
helps reduce the fit uncertainties. We also evaluate in this section the
contributions from different sources of contamination to our sample. Finally we
discuss the impact of our modeling of initial state radiation on the signal
acceptance uncertainty.

4.1 Number of events in the signal region

Monte Carlo events of Z → bb signal were required to pass the Z_BB trigger
simulation and the same offline cuts applied to the experimental data.
Multiplying the signal efficiency obtained from MC by the cross section for Z
boson production times the branching ratio of Z decay to b-quark pairs [17] and
by the total integrated luminosity of the data, we obtain a raw estimate of 9478
events with two SecVtx taggable jets passing the kinematic selection, 4164 of
them with both jets b-tagged.

The modeling of b-quark tagging in the Monte Carlo simulations depends on
subtle characteristics: physics effects such as the admixture of B mesons and
baryons produced in the fragmentation, their lifetimes, and their momentum
spectra; and detector modeling issues such as the detailed description of track
position measurements in the silicon microstrip detector and track-finding
efficiency inside jets. For these reasons, CDF measures a b-tagging scale
factor, defined as the ratio of SecVtx vertex-finding efficiency on b-quark jets
in the data to that in Monte Carlo simulated b-jets. The SecVtx b-tagging
efficiency scale factor has been determined for the run range used in the
present analysis [18] as:

SF(b-tagging) = 0.95 ± 0.01 (stat) ± 0.04 (syst). (1)

In events where both leading jets are b-tagged, this scale factor is squared.

For a proper estimation of the signal efficiency we compared the efficiency of
the Z_BB trigger requirements in data and Monte Carlo. To that end, the
selections used in the Z_BB trigger can be divided into two parts: requirements
on the online-measured energy deposits in the calorimeters, and requirements
on online-reconstructed tracks. The data-MC difference in the calorimeter-based
efficiencies can be studied by comparing inclusive jet samples of data
and simulation, while the differences in track requirement efficiencies can be
studied in a subset of jet samples enriched in b-quark jets. The data-MC
difference in the trigger efficiency can then be expressed as

SF[ε_data/ε_MC] = SF(calor-trigger) × SF(track-trigger).

By comparing jet samples in data and Monte Carlo we found a data/MC ratio
of SF(calor-trigger) = 1.10 ± 0.01 (stat) ± 0.01 (syst) in events containing two
jets with energy similar to those of Z decay. For tracking requirements, the
data/MC ratio was measured to be SF(track-trigger) = 1.12 ± 0.06 (stat) ±
0.08 (syst) by using dijet events where both jets contained a signal of b-quark
decay.

Using the numbers described above, we can now compute our estimate for
the number of events with two positive SecVtx b-tags expected in the signal
region defined in Sec. 3.2:



N_exp = N(raw) × [SF(b-tagging)]^2 × SF(calor-trigger) × SF(track-trigger)
      = 4164 × (0.95)^2 × (1.10) × (1.12) = 4630 events.
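
The arithmetic above is easy to check numerically. The snippet below simply
multiplies the central values quoted in the text; the variable names are chosen
for this example.

```python
raw_double_tagged = 4164   # raw MC estimate, both jets b-tagged
sf_btag = 0.95             # per-jet b-tagging scale factor
sf_calor_trigger = 1.10    # calorimeter trigger data/MC ratio
sf_track_trigger = 1.12    # track trigger data/MC ratio

# The b-tagging scale factor enters squared because both jets are tagged.
n_expected = (raw_double_tagged * sf_btag ** 2
              * sf_calor_trigger * sf_track_trigger)
print(round(n_expected))
```

The product is approximately 4629.9, which rounds to the 4630 events quoted.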



The various sources of uncertainty affecting this estimate are discussed in the 
remainder of this section.

4.2 Initial state radiation uncertainty



The kinematic selection applied to our data makes the expected number of 
signal events dependent on the jet activity in the event, and in particular on 
the modeling of initial state QCD radiation (ISR). 

To estimate the acceptance systematics due to the modeling of ISR, we studied
a sample of Z → e⁺e⁻ and Z → μ⁺μ⁻ events collected by high-P_T electron
and muon triggers in the same run range. We thus in effect substitute electrons
(muons) for the b-tagged jets, but we include cuts that are used in the Z → bb̄
analysis in order to mimic the selection biases our data are subjected to: for
instance, the jet E_T^3 cut is applied to the leading jet of the Z → e⁺e⁻
events.



Large samples of Monte Carlo events were produced using Pythia version 6.216
to compare to leptonic Z decays in the data. The application of a standard
selection [19] produces a total of 14 507 Z → e⁺e⁻ candidates and 13 727
Z → μ⁺μ⁻ candidates in experimental data, while the Monte Carlo samples
contain 249 540 and 427 495 events, respectively. Figs. [4] and [5] show the P_T
of the Z, the leading jet corrected E_T, the leading jet η and the Δφ of the
electron pair after the Z cuts, for both the electron data and MC.




Fig. 4. Left: P_T distribution of the Z boson for electron data and MC (Pythia).
Right: corrected E_T distribution of the leading jet for electron data and MC.
All distributions are normalized to unity.

We apply to the leptonic Z samples kinematic cuts corresponding to those we
use to select our signal sample of Z → bb̄ data: the level 5 corrected E_T of
the leading jet (mimicking the third jet in the Z → bb̄ dataset) is required to
be less than 15 GeV, and the Δφ between the two leptons is required to be
greater than 3.0 radians.



Fig. 5. Left: η distribution of the leading jet for electron data and MC. Right: Δφ
distributions of the electron pair. All distributions are normalized to unity.

In order to calculate the ISR systematic uncertainty on signal acceptance,
we re-weight the MC events so that they perfectly match the distribution of
electron and muon data in three different distributions: the P_T of the Z, the
corrected E_T of the leading jet, and the Δφ between the two leptons. By
dividing the difference between the weighted and unweighted MC events by
the unweighted MC events, we determine the systematic uncertainty on the
acceptance. Because these measurements are highly correlated with each other,
we take the largest of these values, 4.6% for the electrons and 3.0% for the
muons, and average them together. This average, 3.8%, is our final estimate for
a systematic uncertainty on the acceptance of our kinematic selection resulting
from ISR.
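The re-weighting step can be illustrated with a small binned sketch. It assumes that the MC and data are histogrammed in the same bins of one variable and that the fraction of MC events passing the kinematic cuts is known per bin; the histogram contents below are toy numbers, not values from the analysis. Only the final averaging of 4.6% and 3.0% uses figures from the text.

```python
def acceptance(counts, pass_fraction):
    """Fraction of events passing the cuts, given per-bin counts and pass fractions."""
    return sum(c * p for c, p in zip(counts, pass_fraction)) / sum(counts)


def reweighting_shift(mc_counts, data_counts, pass_fraction):
    """Relative acceptance change when MC bins are re-weighted to the data shape."""
    mc_total, data_total = sum(mc_counts), sum(data_counts)
    weights = [(d / data_total) / (m / mc_total)
               for d, m in zip(data_counts, mc_counts)]
    reweighted = [m * w for m, w in zip(mc_counts, weights)]
    unweighted_acc = acceptance(mc_counts, pass_fraction)
    weighted_acc = acceptance(reweighted, pass_fraction)
    return (weighted_acc - unweighted_acc) / unweighted_acc


# Toy inputs (hypothetical): three bins of one kinematic variable.
toy_mc = [500, 300, 200]
toy_data = [45, 32, 23]
toy_pass = [0.9, 0.5, 0.1]
shift = reweighting_shift(toy_mc, toy_data, toy_pass)

# Quoted in the text: largest shift for electrons and for muons, then averaged.
final_isr_percent = (4.6 + 3.0) / 2
```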

4.3 Uncertainty on the expected signal

There are several independent sources of uncertainty on our estimate of the
number of Z → bb̄ events:

• the statistical uncertainty on the signal efficiency calculated from MC (due
to the finite sample): 1.0%;

• the uncertainty on the integrated luminosity of our dataset: 5.9%;

• the uncertainty on the data/MC b-tagging scale factor, evaluated from
Eq. (1): it amounts to 4.3%, and it effectively doubles since we require two
b-tags per event in our analysis;

• the uncertainty in the acceptance of the Z_BB trigger, described in Sec. 4.1.
The effect is divided into calorimeter-cut modeling uncertainties and
track-cut modeling uncertainties, which respectively amount to 1.8% and 8.9%;

• the acceptance uncertainty from ISR modeling in the Monte Carlo, estimated
in the previous section: this amounts to an additional 3.8% systematic
uncertainty;

• the uncertainty related to the modeling of final state radiation (FSR): by
varying the parameters governing QCD radiation of final state b-quarks in Z
Monte Carlo events an additional 2.9% uncertainty is estimated.

The above sources of uncertainty on the signal acceptance are summarized in
table [2]. The total quadratic sum of these uncertainties amounts to 14.7%. We
thus expect N_{++}^{exp} = 4630 ± 681 events of signal to pass all the steps of
our event selection.
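The quoted total can be cross-checked by adding the listed contributions in quadrature. The sketch below assumes that "effectively doubles" means the b-tagging term enters as 2 × 4.3%; under that reading the sum reproduces the 14.7% total and, using the rounded percentage, the ±681 events. The dictionary labels are illustrative.

```python
import math

# Relative uncertainties in percent, as listed in the text.
sources = {
    "MC statistics": 1.0,
    "luminosity": 5.9,
    "b-tagging (two tags, assumed 2 x 4.3)": 2 * 4.3,
    "calorimeter trigger": 1.8,
    "track trigger": 8.9,
    "ISR": 3.8,
    "FSR": 2.9,
}

total_percent = math.sqrt(sum(v ** 2 for v in sources.values()))
rounded_percent = round(total_percent, 1)

n_expected = 4630
absolute_uncertainty = round(n_expected * rounded_percent / 100)
print(rounded_percent, absolute_uncertainty)
```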



4.4 Residual contaminations from other physics processes

We have defined, in section 3.2, the signal region as the sample of events
with two central leading b-tagged jets that pass our Δφ_12 and E_T^3 kinematical
selections. On the other hand, events with two b-tags failing either one of these
Δφ_12 and E_T^3 cuts, as well as events with two taggable jets (that pass or fail
the kinematical selections), are background-enriched regions that are used to
construct the background shape used in the fitting procedure (see section [5]).

All these samples are dominated by the large background from generic QCD
2 → 2 scattering reactions. However, a few additional physics processes can
contaminate these data. Among these are Z → bb̄ events leaking into the
background regions, and Z → cc̄ and W → cs̄ events that pass our selections
since c-quarks can produce a secondary vertex.

Z decays present in the signal zone defined by the cuts Δφ_12 > 3.0 and E_T^3 < 15
GeV amount to about 1.7% in events with two b-tagged jets, while the fraction
is 0.2% in events with two taggable jets. Among events failing the kinematical
cuts, the fraction of Z → bb̄ decays is equal to 0.8% in events with two b-tagged
jets, and falls by an order of magnitude (0.08%) in events with two taggable
jets. The signal contamination in samples used to construct our background
model will be accounted for in section [5], since a fraction of signal higher than
0.10% starts to affect appreciably the shape of the dijet mass distribution and
thus the modeling of our background function.

Samples of Z → cc̄ and W → cs̄ MC events have been studied to evaluate the
contamination of these physics processes in our data. For the Z → cc̄ channel
we expect, in data with two taggable jets, a contamination below 0.02% for
events that pass the Δφ_12, E_T^3 kinematical selections and a contamination of
about 0.01% for events that fail these selections. The contamination of this
process in data with two tagged jets is 0.03% for the signal region and about
0.02% in the background region. Similarly, for the W → cs̄ channel we expect, in
events with two taggable jets, a contamination below 0.05% for events that
pass the kinematical selection and a contamination of about 0.03% for events
that fail this selection. The contamination in events with two tagged jets
is negligible (below 0.01%). Each of these contributions is smaller than the
smallest contribution coming from the Z → bb̄ process. In summary, the Z → cc̄
and W → cs̄ decays have a negligible impact on the determination of the b-jet
energy scale factor, and we do not correct our data for the presence of these
processes.
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5 The fitting procedure 



In this section we describe the parametrization of signal and background
templates and the unbinned likelihood fit method we employ to extract the Z
signal and measure the b-jet energy scale factor.



5.1 Construction of background and signal templates



5.1.1 Background templates 

To model the dijet mass shape of the background collected in double b-tagged
data after the selection of events with Δφ_12 > 3.0 and E_T^3 < 15 GeV we rely
on a three-step procedure:

(1) we first determine the invariant mass shape of the ratio R(m_jj) between
events with two positive SecVtx b-tags (labeled "(++)") and events with
two taggable jets (labeled "(00)"), in several regions with poor signal
fraction. Such regions are defined by selecting events that fail one or both
of the cuts Δφ_12 > x and E_T^3 < y (with x < 3.0, y > 15 GeV). These
background regions have by construction zero overlap with the region of
kinematical space where we look for the signal. R is thus defined as

$$R(m_{jj}; x, y) = \frac{N_{++}(m_{jj}; x, y)}{N_{00}(m_{jj}; x, y)}$$

(2) the ratio R is then multiplied by the mass distribution of events with two
taggable jets found within the signal region (Δφ_12 > 3.0, E_T^3 < 15 GeV)
to construct a background template as a function of the two parameters
x, y;

(3) finally, the resulting distribution is fit with a continuous parameterization
that well models the full spectrum of invariant masses between 0 and 200
GeV/c².
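Steps (1) and (2) of the procedure above can be sketched on binned dijet mass histograms. This is a minimal illustration, assuming all histograms share the same binning; the function names and the toy bin contents are hypothetical, and the parametric fit of step (3) is not shown.

```python
def tag_rate(n_pp_background, n_00_background):
    """Per-bin ratio R = N_++ / N_00 measured in a background region.

    Bins with no taggable events get R = 0 to avoid dividing by zero.
    """
    return [pp / zz if zz > 0 else 0.0
            for pp, zz in zip(n_pp_background, n_00_background)]


def background_template(n_pp_background, n_00_background, n_00_signal_region):
    """Multiply R by the taggable-dijet mass distribution of the signal region."""
    ratio = tag_rate(n_pp_background, n_00_background)
    return [r * n for r, n in zip(ratio, n_00_signal_region)]


# Toy histograms (hypothetical contents, three mass bins).
template = background_template(
    n_pp_background=[10, 40, 20],
    n_00_background=[1000, 2000, 1000],
    n_00_signal_region=[500, 1500, 800],
)
print(template)
```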

In order to describe the data-driven background shape, derived as discussed
above, we choose as a probability density function (p.d.f.) the sum of a Pearson
IV function [20] and an error function (erf):

$$P_b(m_{jj}) = P_{\mathrm{IV}}(m_{jj}; \beta_i) + \mathrm{erf}(m_{jj}; \beta_i) \qquad (2)$$

where β_i are the parameters of the p.d.f. (see an example in Fig. [6]).



Fig. 6. Example of a data-driven background dijet mass distribution fit with a
Pearson IV function plus an error function (χ²/ndf = 81.22/70, probability
0.1691). The horizontal axis is the dijet mass M_jj in GeV/c².

If we select a background region too close to the signal region, it will be 
appreciably contaminated with signal events, while a background region based 
on extreme values of the parameters (very low x or high y for example) will 
select events kinematically quite different from those populating the signal 
region and will likely fail to provide a satisfactory model of the background 
in the signal region. Understanding the correlation between the selection of 
the background region and the tag rate function R is not trivial since the tag 
rate shape depends not only on the kinematic variables but also on the sample 
composition of data in the background region. We thus have no meaningful 
way to favor a priori a background model, that is, a particular choice of (x, y) 
over another. We thus consider a large set of background models in our fitting 
procedure. 

We construct different background models by scanning the (x, y) variables 
space: we vary the value of x from 2.50 to 3.00 rad in steps of 0.02 rad and 
we vary y from 15 to 25 GeV in steps of 1 GeV. This yields a total of 286 
possible forms of the ratio R. All these determinations are correlated to each 
other, but they possess slight differences. These differences translate into 286 
different background templates. 
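The count of 286 background models follows from the scan ranges quoted above: 26 values of x times 11 values of y. A short sketch that enumerates the grid (the variable names are illustrative):

```python
# x: delta-phi cut from 2.50 to 3.00 rad in steps of 0.02 rad (26 values).
x_values = [round(2.50 + 0.02 * i, 2) for i in range(26)]
# y: third-jet E_T cut from 15 to 25 GeV in steps of 1 GeV (11 values).
y_values = list(range(15, 26))

# One background model per (x, y) pair.
scan_grid = [(x, y) for x in x_values for y in y_values]
print(len(x_values), len(y_values), len(scan_grid))
```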

The contamination of Z → bb̄ signal in the 286 background regions must be
accounted for (see section H]). There is a 0.6-0.8% signal fraction in events 
with two b-tagged jets, depending on the (x, y) selection used. The fraction
of signal in taggable dijets in the background regions is instead a factor of
ten smaller. We correct the signal contamination by estimating from Monte
Carlo the amount of Z → bb̄ signal in the background region for each choice
of kinematic cuts, and by subtracting the expected signal from each dijet
invariant mass template used to construct the tag rate function R. The mass
distribution of events with two taggable jets in the signal region, by which the
tag rate function is multiplied to construct the background shape, also contains
a small contamination of signal, and is corrected accordingly. A data/MC
b-JES factor of 1.00 is assumed when subtracting MC templates from the data.
A systematic uncertainty on the final measurement will be estimated later to
account for this specific choice (see section [6]).

5.1.2 Signal templates 

To construct dijet mass signal templates we use distributions of fully simulated
Monte Carlo Z → bb̄ events. In events that pass the Z_BB trigger simulation
we apply a factor k to the energy of each jet in order to mimic a data/MC
scale factor; we then apply to the modified jet energies the same event and
kinematic selection applied to the data. As k varies from 0.90 to 1.10 in steps
of 0.01 we can thus construct 21 different dijet mass templates for the signal.
Each of these distributions is fit to a sum of three gaussian functions, as
shown in Fig. [7]. To obtain one single probability density function which has
dijet invariant mass and b-jet energy scale as parameters, P_s(m_jj; k), we fit
simultaneously the 21 templates allowing each parameter of the three gaussian
functions to vary linearly with the scale factor k.
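The structure of such a signal p.d.f. can be sketched as a sum of three Gaussians whose weights, means and widths are each linear in k. The parameter values below are invented placeholders (the fitted values are not given in the text), and writing each parameter as a + b·(k − 1) is one possible convention, assumed here for illustration.

```python
import math

# The 21 scale-factor values used to build the templates: 0.90, 0.91, ..., 1.10.
K_VALUES = [round(0.90 + 0.01 * i, 2) for i in range(21)]


def gauss(m, mean, sigma):
    """Unit-normalized Gaussian density."""
    z = (m - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def linear(param, k):
    """Parameter that varies linearly with k: a + b * (k - 1)."""
    a, b = param
    return a + b * (k - 1.0)


def signal_pdf(m, k, components):
    """Sum of Gaussians with k-dependent weights, means and widths."""
    weights = [linear(c["w"], k) for c in components]
    norm = sum(weights)
    return sum(w / norm * gauss(m, linear(c["mu"], k), linear(c["sigma"], k))
               for w, c in zip(weights, components))


# Placeholder parameters (hypothetical, not fitted values): (a, b) pairs.
TOY_COMPONENTS = [
    {"w": (0.6, 0.0), "mu": (85.0, 85.0), "sigma": (12.0, 5.0)},
    {"w": (0.3, 0.0), "mu": (70.0, 70.0), "sigma": (20.0, 5.0)},
    {"w": (0.1, 0.0), "mu": (100.0, 100.0), "sigma": (35.0, 5.0)},
]
```

Because the weights are renormalized inside `signal_pdf`, the density stays normalized for every k even if the weights themselves are allowed to vary.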



Fig. 7. Signal dijet mass distribution fitted with a sum of three gaussian functions
(for k=1.0; χ²/ndf = 83.01/64, probability 0.05534).

5.2 The fitting function

We use an unbinned likelihood procedure to measure the number of signal and
background events (respectively n_s and n_b) and the b-jet energy scale factor
k in our data. We find the probability that the observed data is described as
an admixture of background events and Z → bb̄ events with a data/MC b-jet
energy scale k, by employing the following likelihood function:



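$$\mathcal{L}(k) = \mathcal{L}_{\mathrm{shape}}(k) \times \mathcal{L}_{(n_s+n_b)} \qquad (3)$$

with

$$\mathcal{L}_{\mathrm{shape}}(k) = \prod_{i=1}^{N} \frac{n_s P_s(m_{jj}^{i}; k) + n_b P_b(m_{jj}^{i})}{n_s + n_b}$$

and

$$\mathcal{L}_{(n_s+n_b)} = \frac{e^{-(n_s+n_b)} (n_s+n_b)^{N}}{N!}$$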



where the term ℒ_shape(k) is the product, over all events, of the probability that
the i-th event with dijet mass m_jj is described by the background p.d.f. P_b(m_jj)
and the signal p.d.f. P_s(m_jj; k), given a b-jet energy scale (b-JES) factor k. The
second term ℒ_(n_s+n_b) is introduced to constrain the total number of signal and
background events (n_s + n_b) to the event count N in the selected data sample.

We minimize −ln(ℒ) to find the best b-JES factor hypothesis. The statistical
error is given by the difference between this scale factor and the scale factor
at which −ln(ℒ) exceeds its minimum value by 0.5.
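The negative log-likelihood corresponding to the shape term and the Poisson term can be written compactly. This is a sketch under stated assumptions: `p_s` and `p_b` are user-supplied normalized densities (hypothetical callables standing in for the fitted signal and background p.d.f.s), and the minimization itself is left to whatever optimizer is available.

```python
import math


def negative_log_likelihood(masses, n_s, n_b, k, p_s, p_b):
    """-ln(L) for the shape term times the Poisson term on n_s + n_b.

    masses : list of observed dijet masses (N events)
    p_s    : callable p_s(m, k), normalized signal density
    p_b    : callable p_b(m), normalized background density
    """
    n_total = n_s + n_b
    n_observed = len(masses)
    # Shape term: sum over events of the log of the signal/background mixture.
    log_shape = sum(
        math.log((n_s * p_s(m, k) + n_b * p_b(m)) / n_total) for m in masses
    )
    # Poisson term: ln[ exp(-n_total) * n_total^N / N! ], with lgamma(N + 1) = ln N!.
    log_poisson = (-n_total + n_observed * math.log(n_total)
                   - math.lgamma(n_observed + 1))
    return -(log_shape + log_poisson)
```

For fixed shapes, the Poisson term alone is minimized when n_s + n_b equals the observed event count, which is the constraint described in the text.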

5.3 Test of the fitting procedure with pseudo-experiments

Before applying our fitting method to experimental data, we use
pseudo-experiments to test its performance. In particular, pseudo-experiments
allow us to check whether the closeness of the signal to the peak in the
background biases the resulting b-JES and number of signal events.

To perform pseudo-experiments we first construct a data-driven background
distribution as discussed in Sec. 5.1, which we parameterize using the function
defined previously (Eq. [2]). Then for a given input signal fraction and
simulated b-JES factor we construct pseudo-data templates drawing n_b
background events from the background p.d.f. and n_s signal events from the
signal p.d.f. (n_b and n_s are smeared according to Poisson statistics for each
pseudo-experiment). Pseudo-data distributions are then fit using the unbinned
likelihood procedure and the output parameters (b-JES and number of events of
signal and their statistical errors) are histogrammed.
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We construct pseudo-data distributions varying the input signal fraction from 
1% to 3% and the input b-JES factor from 0.95 to 1.05. For each selection we
generate 1000 pseudo-experiments, each containing a number of events equal
to that observed in the signal region for double tagged data (267 246 events).
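The generation of one pseudo-experiment can be sketched as follows. Everything about the shapes is a stand-in: a single Gaussian whose mean scales with k plays the role of the signal p.d.f., an exponential plays the role of the background p.d.f., and the Poisson smearing uses a Gaussian approximation that is adequate only for large means. None of these choices is taken from the analysis.

```python
import math
import random


def poisson_large_mean(mean, rng):
    """Gaussian approximation to a Poisson draw (assumption: mean is large)."""
    return max(0, round(rng.gauss(mean, math.sqrt(mean))))


def make_pseudo_experiment(n_total, signal_fraction, k, rng):
    """Return (masses, n_s, n_b) for one pseudo-data sample."""
    n_s = poisson_large_mean(n_total * signal_fraction, rng)
    n_b = poisson_large_mean(n_total * (1.0 - signal_fraction), rng)
    # Toy signal shape (hypothetical): Gaussian with mean proportional to k.
    signal = [rng.gauss(85.0 * k, 12.0) for _ in range(n_s)]
    # Toy background shape (hypothetical): falling exponential in mass.
    background = [rng.expovariate(1.0 / 40.0) for _ in range(n_b)]
    masses = signal + background
    rng.shuffle(masses)
    return masses, n_s, n_b


# Example with a reduced sample size; the text uses 267 246 events per experiment.
rng = random.Random(1)
masses, n_s, n_b = make_pseudo_experiment(2000, 0.02, 1.0, rng)
```

Each pseudo-data sample would then be passed to the unbinned fit, and the fitted b-JES and signal yield histogrammed over the 1000 repetitions.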

Fig. [8] shows the mean fitted output scale factor as a function of the input scale
factor. The results obtained confirm that the b-jet energy scale factor can be
extracted from our data even if the signal fraction is very small and its shape
peaks at a mass value not very far from the peak of the background shape. We
observe no bias from our fitting method. However, for a check of systematic
effects due to the finite statistics of the signal template, and for an evaluation
of systematic effects due to the imprecise knowledge of the background shape,
we need separate studies. These are described in Sec. [6].

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Fig. 8. Mean fitted b-JES factor as a function of the input scale factor. 



5.4 Fits to the selected data 

5.4.1 Extraction of the b-JES and Z → bb̄ signal

The method we apply to perform the measurement of the b-jet energy scale
and number of events of Z → bb̄ signal in our selected sample of data is the
following:

(1) we scan the Δφ_12 and E_T^3 kinematic space to construct 286 different
data-driven dijet mass background models (see Sec. 5.1.1);

(2) each background shape is used in a preliminary binned likelihood fit of
the selected data in a region of the dijet mass distribution containing
little or no signal contamination: the "sideband" region is defined as the
mass spectrum from 0 to 60 GeV/c² and from 120 to 200 GeV/c²;

(3) we fit the data with the unbinned likelihood function defined in Sec. 5.2,
using in turn the different background models;

(4) we choose as our measurement of the b-JES the result obtained with the
background shape which provided a fit to the sideband with the highest
p-value (step 2). Results obtained with other background shapes are used
to estimate a systematic uncertainty related to the degree of arbitrariness
of our procedure.

We increase the precision of our fit by including in the unbinned likelihood
procedure a gaussian constraint on the expected number of signal events,
computed with the Monte Carlo simulation. We thus add to the likelihood
defined in Sec. 5.2 a term $\exp\left(-\frac{(N_{sig} - n_s^{exp})^2}{2\sigma_n^2}\right)$,
where N_sig is the number of signal events and n_s^exp ± σ_n the MC prediction
for the same number. Fig. [9] shows the result of the unbinned likelihood fit to
double SecVtx tagged data obtained with the background shape that best fits
the sideband, when a constraint n_s^exp = 4630 ± 681 (derived in Sec. [4]) is
applied. The fit returns N_sig = 5621 ± 436 events and a b-JES factor equal to
0.974 ± 0.011 (errors are statistical only). The goodness of this fit is estimated
by calculating the χ²/NDF corresponding to the likelihood value at convergence,
which is found to be 104/75.




Fig. 9. Results of the constrained unbinned likelihood fit performed on double-tagged 
dijet data (points). A Gaussian constraint of 4630 ± 681 on the fitted number of 
signal events is applied. The data-driven background shape and Monte Carlo signal 
p.d.f are shown (hatched functions). The fit returns 5621 ± 436 signal events and 
a b-JES of 0.974 ± 0.011. The inset on the upper right shows the data minus the 
background distribution (points) and the signal shape normalized to the fitted number 
of events of signal.

5.4.2 Additional checks



We first check if the excess of events observed just below 50 GeV/c², which
is due to an imperfect modeling of the rise of the dijet mass distribution
(Fig. [9]), could bias our result. We perform an unbinned likelihood fit on the
restricted mass range from 50 to 200 GeV/c². This fit returns a b-JES factor
of 0.971 ± 0.011 and 5862 ± 440 signal events. The fit has a χ²/NDF of 81/70.
This result is in good agreement with the default measurement obtained by
fitting the full mass window from 0 to 200 GeV/c².

We then verify that the fitted b-JES does not vary appreciably when the
gaussian constraint on the number of signal events is removed from the unbinned
likelihood procedure. We find a fitted b-JES factor of 0.971 ± 0.011 and a fitted
signal of 6317 ± 576 events (χ²/NDF = 102/75). The signal increases by about
700 events with respect to the constrained fit, but the b-JES factor remains
stable within its statistical uncertainty.



6 Results on the b-jet energy scale

In this section we discuss the sources of systematic uncertainty affecting our
determination of the b-jet energy scale and quote a complete measurement for
this quantity.

6.1 Systematics related to the background modeling 

An evaluation of the systematic uncertainty on the b-JES related to the choice
of the background model used to perform our measurement requires care.
In Sec. 5.4 we used the p-value of the sideband fits as a criterion to select
the baseline background model, but the arbitrariness of that choice and the
dependence of the background shape on the characteristics of the data might
affect the result. To estimate the resulting systematic uncertainties we perform
fits to the experimental data using in turn each of the 286 shapes of background
models described in section 5.1.1. We then create a histogram of the results
obtained for the b-JES by each background shape, using the p-value of their
respective sideband fits as a weight. We take the root-mean-square difference
with respect to the most probable value, in the resulting distribution, as our
estimate of the systematic uncertainty on the b-JES factor. This results in an
absolute error of t^oo 2 ,. 
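The weighted spread described above can be sketched as follows: the fitted values are histogrammed with the sideband p-values as weights, the most probable value is taken as the center of the highest bin, and the weighted root-mean-square deviation from it is computed. The bin width, the use of the bin center as the mode, and the toy inputs are assumptions made for illustration.

```python
import math


def weighted_rms_about_mode(values, weights, bin_width):
    """Return (mode, rms): mode from a weighted histogram, rms of values about it."""
    histogram = {}
    for value, weight in zip(values, weights):
        index = math.floor(value / bin_width)
        histogram[index] = histogram.get(index, 0.0) + weight
    mode_index = max(histogram, key=histogram.get)
    mode = (mode_index + 0.5) * bin_width
    weight_sum = sum(weights)
    variance = sum(w * (v - mode) ** 2 for v, w in zip(values, weights)) / weight_sum
    return mode, math.sqrt(variance)


# Toy inputs (hypothetical): four fitted b-JES values and their sideband p-values.
toy_values = [0.9701, 0.9741, 0.9742, 0.9801]
toy_pvalues = [1.0, 2.0, 2.0, 1.0]
mode, rms = weighted_rms_about_mode(toy_values, toy_pvalues, bin_width=0.002)
```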

Two additional sources of systematic uncertainty are associated with the
background modeling. The first one is due to the finite statistics of background
templates used to derive the probability density functions for our unbinned
likelihood fit. To estimate the size of this effect we perform different sets of
pseudo-experiments using as input the background model which provides the
best fit to experimental data in the sideband region of the dijet mass
distribution. For each set of pseudo-experiments we fluctuate with Poisson
smearing the number of events in each bin of the background distribution and
measure the resulting bias on the fitted b-JES factor. The mean systematic
uncertainty on the b-JES is then found to be ±0.011.

A different potential source of systematic uncertainty comes from correcting
the background shape for signal contamination. As described in section 5.1.1,
we correct the background templates for the presence of signal by subtracting
the expected signal assuming a data/MC b-JES of 1.00. We estimate the
systematics related to that assumption by performing pseudo-experiments again.
An additional ±0.005 systematic uncertainty is attributed to the b-JES factor
from the correction.

6.2 Systematics related to Monte Carlo signal templates 

For the pseudo-experiments described in Sec. 5.3, signal events used in the
construction of pseudo-data samples are drawn from the signal p.d.f. That
procedure is appropriate as long as we are testing the fitting procedure and
estimating if the closeness of the signal to the background turn-on could yield
any biases. An additional source of uncertainty may be associated with the
differences between the original Monte Carlo Z → bb̄ dijet mass template and the
signal p.d.f. used in the unbinned likelihood procedure. To study that effect,
we perform a further set of pseudo-experiments by drawing signal events from
the original MC template. We estimate a systematic uncertainty of ±0.003 on
the b-JES factor from that source.

A second source of systematic uncertainty is related to the finite statistics of
the MC signal templates. To estimate this error we use a similar procedure to
that used for the modeling of the background shape. Pseudo-experiments give
a ±0.002 mean systematic uncertainty on the b-JES.

6.3 Other sources of uncertainty

There are a few additional sources of systematic uncertainty in the
determination of the b-jet energy scale which, however, are not included in our
quoted total systematic error. In fact, the use of the measurement of a b-JES
implies a choice of certain parameters and models in the simulation of Monte
Carlo events: if the same choices are made as those we used in the generation
of the Z Monte Carlo sample (choice of generator, PDF set, and specific settings
for initial and final state radiation modeling), there is no need to consider any
systematic uncertainties affecting the b-JES due to those sources.

We nonetheless evaluate how a different choice of those parameters changes
the value of the b-JES. We use Z → bb̄ dijet mass templates from samples of
Monte Carlo with increased and decreased initial state radiation (ISR) and
final state radiation (FSR) in pseudo-experiments to estimate the systematic
uncertainty on our measurement due to these sources. The result is an estimate
of, respectively, ±0.004 and ±0.012 systematic uncertainty on the b-JES factor.

Additionally, we estimate the effect of the variation of the PDF set on
reconstructed Z → bb̄ dijet mass templates and thus on the b-jet energy scale
determination. The estimate of the uncertainty is obtained by looking at the
difference between results obtained using the default CTEQ5L set of PDFs and
those obtained with the MRST72 set. Additionally, MRST72 and MRST75
sets derived using different Λ_QCD values are compared, and 20 eigenvectors
defining the CTEQ6M set are independently varied by ±1 standard deviation.
Differences in pseudo-experiments resulting from these different PDFs
are added in quadrature to obtain a total uncertainty of ±0.005 on the fitted
b-JES factor.

6.4 Final result on the b-jet energy scale 

To summarize, table [3] details the sources of systematic uncertainty on the
b-jet energy scale measurement. All sources are assumed to be uncorrelated;
thus a total systematic uncertainty is calculated as the sum in quadrature of
the various sources. Table [4] shows other sources of uncertainties which are not
included in the b-JES factor measurement.

Our final result for the b-jet energy scale factor is

k = 0.974 ± 0.011 (stat)lffi(syst) = 0.974±g;g?g (total). 

This k factor refers to R = 0.7 jets corrected with level 5 standard CDF
jet corrections. In addition, it has been extracted from b-quark jets tagged
by the SecVtx algorithm, with transverse energy mostly in the range 22 <
E_T < 50 GeV, and it is relevant for comparison between data collected by the
CDF experiment from 2003 to 2005 and corresponding Pythia Monte Carlo
simulations.

The energy range relevant to Z production is indeed restricted, though not
dramatically different from that of jets from top quark decay. We believe
our measurement of the b-jet energy scale can be used in top mass analyses,
provided that the same jet cone definition is used (R = 0.7 jets). Analyses such
as the top mass measurement in the dilepton channel could sizably reduce their
systematic uncertainty by exploiting the correlations between the standard
JES, the b-jet energy scale and the measured top mass. Unfortunately the k
factor we measured is not directly usable in analyses that use a different jet
cone size (R = 0.4 for example). This is due to the fact that one of the largest
systematic uncertainties in the generic jet energy scale correction is cone-size
dependent (the level 7 jet correction). Thus, we cannot constrain, for example,
the R = 0.4 jet cone uncertainties using a k factor extracted from R = 0.7
jets. A future measurement of the b-jet energy scale using different jet cones
could then be useful.

In the meantime, we believe our result can be used as an independent
constraint in measurements which are sensitive to high-P_T b-jets and that use
jet definitions similar to ours (jet range, cone size, etc.).



7 Cross section determination 

In this section we evaluate the cross section for Z boson production multiplied
by the branching ratio of Z boson decay to b-quark pairs using the number of
events of signal returned by a fit to the dijet mass distribution.

7.1 Method of measurement

The procedure to extract the signal is slightly different with respect to what
is done for the b-jet energy scale measurement. First of all, since we aim to
measure the Z boson cross section we cannot use a constraint on the number
of expected signal events in the likelihood fit. Second, we cannot use the
Z → bb̄ cross section value to estimate, and subtract, the number of events of
signal that fall into the kinematical regions used to construct the dijet mass
background shape (see section H]). One simple way to solve this problem is 
to perform an iterative fitting procedure to correct the signal contamination 
in the background, without any prior knowledge of the Z cross section. We 
use the following iterative method: an unbinned likelihood fit is performed to 
the selected data using an uncorrected background shape. In this fit the b-jet
energy scale is left free and no constraint is applied to the expected number
of events of signal. From the number of events of signal thus measured we
can extrapolate (from Z → bb̄ Monte Carlo) the number of signal events
in the background data samples, and subtract the resulting contribution. A
new dijet invariant mass shape is then constructed from these
background-corrected samples. This procedure converges after a few iterations
to a stable number of fitted signal events. To improve the precision of the
measurement, the normalization of the background shape is constrained to the
data sideband dijet mass region (m_jj < 60 GeV/c² and m_jj > 120 GeV/c²), at
each step of the iterative procedure.

Apart from the fitting procedure, the general method of measurement is left
unchanged with respect to the b-jet energy scale measurement. A large set
of background dijet invariant mass shapes is constructed scanning the Δφ_12
and E_T^3 space, as described in section [5]. Each background shape is used to
fit the selected data with the usual unbinned likelihood procedure, with the
difference that the fit is now performed iteratively and that the background
shape is constrained to the data sideband. We take as our measurement of
the number of signal events the result obtained with the background shape
that best fits the sideband (after iteration), while the results obtained with
the other background models contribute to the estimate of the systematic
uncertainty.

The measurement performed with the background shape that best fits the data
sideband yields N_sig = 6467 ± 504 (stat) events of signal and a b-jet energy
scale factor of 0.976 ± 0.010 (stat), in agreement with the result in Sec. El 
From the fitted number of signal events we extract a cross section using the 
formula 

$$\sigma_Z \times B(Z \to b\bar{b}) = \frac{N_{sig}}{\epsilon_{kin} \cdot \epsilon_{tag} \cdot L}$$

where σ_Z and B(Z → bb̄) are respectively the Z boson cross section and the
branching ratio of Z decaying into a pair of b-quarks. The signal acceptance
after all kinematical selections is given by the term ε_kin, while the b-tagging
efficiency is given by the term ε_tag. Finally, L is the total integrated luminosity
for the dataset on which the measurement is performed.
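The formula translates directly into a one-line function. In the sketch below the fitted yield (6467 events) is taken from the text and the luminosity (584 pb⁻¹) from the abstract; the two efficiency values are placeholders invented for illustration, since the measured efficiencies are not quoted in this section, so the printed number is not the measured cross section.

```python
def cross_section_times_br(n_sig, eff_kin, eff_tag, luminosity_pb):
    """sigma_Z x B(Z -> b bbar) in pb, for a luminosity given in inverse pb."""
    return n_sig / (eff_kin * eff_tag * luminosity_pb)


N_SIG = 6467            # fitted signal events quoted in the text
LUMINOSITY_PB = 584.0   # integrated luminosity in pb^-1, from the abstract
EFF_KIN_TOY = 0.05      # placeholder value, NOT from the analysis
EFF_TAG_TOY = 0.20      # placeholder value, NOT from the analysis

toy_result = cross_section_times_br(N_SIG, EFF_KIN_TOY, EFF_TAG_TOY, LUMINOSITY_PB)
print(f"{toy_result:.1f} pb (illustrative only)")
```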

7.2 Evaluation of systematic uncertainties 

The effect of all main sources of systematic uncertainties affecting the
measured cross section is summarized in table [5]. Below we discuss each
contribution separately.

The systematic error on total signal efficiency, ε, due to standard CDF jet
energy corrections, is estimated from Z → bb̄ MC where the energy of measured
jets is shifted by ±1σ_c of the standard jet energy correction. The relative
error on the signal efficiency is then calculated from the efficiencies obtained
with the +1σ_c and −1σ_c shifts, and amounts to (1.6 ± 1.1)%.

The effect of increased or decreased final state radiation (FSR) on signal
efficiency is evaluated using Z → bb̄ MC samples generated with different FSR
tunings. The relative uncertainty on efficiency due to this source is found to
be (2.9 ± 1.1)%. The systematic uncertainty related to initial state radiation
was derived from the Z → ℓ⁺ℓ⁻ data/MC comparison (where ℓ is an electron or a
muon, see Sec. 4.2) and yields an additional 3.8% contribution to the total
error. The relative error on signal efficiency due to different parton distribution
function parameterizations (see Sec. 6.3) is estimated to be (7.3 ± 3.0)%. An
additional source of systematic error is related to the MC generator
dependence. We evaluate it by comparing Pythia and Herwig [21] Z → bb̄ samples.
The resulting uncertainty on signal efficiency is found to be (2.2 ± 5.4)%. To
be conservative we take the error on this measured value as our systematic.

The systematic uncertainty affecting the trigger efficiency measurement was
also estimated (see Sec. 4.1) and found to be 9.1%.

Several sources of systematic uncertainty related to the data-driven
background modeling and the fitting procedure were taken into account. The
first one derives from the finite statistics of the templates used in the
background model construction, and was estimated by performing
pseudo-experiments in a similar way as in Sec. 6.1. The mean uncertainty on
the number of fitted signal events is 8.3%. A second systematic uncertainty is
related to the sideband criteria we applied to select the "best" background
model used to perform the cross section measurement. As in Sec. 6.1, we create
a histogram of the fitted number of signal events obtained with each of the
different background shapes, using the p-value of their respective sideband
fits as a weight. This fitting procedure, performed with the iterative method
described previously, yields an uncertainty on the number of fitted
$Z \to b\bar{b}$ events of $^{+34.3}_{-15.0}$%.
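
The p-value weighting can be illustrated with a small sketch. It only shows how a weighted histogram of fitted yields might be summarized by a weighted mean and RMS; the input numbers and the function name are hypothetical, and the way the asymmetric interval is actually extracted from the histogram is not specified in this section.

```python
import math


def weighted_mean_and_rms(fitted_counts, p_values):
    """Summarize fitted signal yields, weighting each by its sideband-fit p-value."""
    total_weight = sum(p_values)
    if total_weight <= 0:
        raise ValueError("at least one p-value must be positive")
    mean = sum(n * w for n, w in zip(fitted_counts, p_values)) / total_weight
    variance = (
        sum(w * (n - mean) ** 2 for n, w in zip(fitted_counts, p_values))
        / total_weight
    )
    return mean, math.sqrt(variance)


# Hypothetical yields from three background shapes and their sideband p-values.
counts = [90.0, 100.0, 110.0]
weights = [0.2, 0.4, 0.2]
mean_yield, rms_yield = weighted_mean_and_rms(counts, weights)
print(f"weighted mean = {mean_yield:.1f}, relative spread = {rms_yield / mean_yield:.1%}")
```

Background shapes with a poor sideband fit receive a small weight, so they contribute little to the spread assigned as a systematic uncertainty.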

We also estimated the effect on the number of fitted signal events due to the
constraint of the background shape to the dijet mass spectrum sideband. In
fact, an additional systematic error can arise from the uncertainty on the
background normalization or from the small leakage of the signal into the
sideband. We evaluate this uncertainty by performing pseudo-experiments where
the background normalization is shifted by $\pm 1\sigma_B$ ($\sigma_B$ is the
statistical uncertainty on the background normalization). The mean relative
error on the fitted number of signal events is measured to be 4.2%.
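
The idea behind this test can be sketched with a toy counting experiment. This is a deliberately simplified stand-in for the unbinned likelihood fits used in the analysis: the signal is taken as the observed count minus the assumed background, and all yields, the size of $\sigma_B$, and the function name are hypothetical.

```python
import math
import random


def mean_relative_shift(s_true, b_true, rel_sigma_b, n_toys=2000, seed=1):
    """Toy counting experiment: how far does the extracted signal move when
    the assumed background normalization is lowered by one sigma_B?"""
    rng = random.Random(seed)
    shifts = []
    for _ in range(n_toys):
        # Gaussian approximation to Poisson fluctuations of the observed count.
        n_obs = rng.gauss(s_true + b_true, math.sqrt(s_true + b_true))
        s_nominal = n_obs - b_true
        s_shifted = n_obs - b_true * (1.0 - rel_sigma_b)
        shifts.append(abs(s_shifted - s_nominal) / s_nominal)
    return sum(shifts) / len(shifts)


# Hypothetical yields and a hypothetical 0.2% normalization uncertainty.
toy_shift = mean_relative_shift(s_true=5000.0, b_true=100000.0, rel_sigma_b=0.002)
print(f"mean relative shift on fitted signal: {toy_shift:.1%}")
```

The toy makes the scaling explicit: when the background is much larger than the signal, even a small relative uncertainty on its normalization translates into a shift of several percent on the fitted signal.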

Finally, a systematic uncertainty due to the iterative background correction
procedure is estimated. The iterative method relies on the extrapolation of
the number of signal events in the background templates, given a fitted number
of signal events in the data. These background correction functions are
estimated from Monte Carlo and carry a statistical uncertainty. The systematic
error due to this source is estimated from fits to the data performed with
smeared correction functions. The mean uncertainty on the number of signal
events is estimated to be 1.6%.

Two additional sources of systematic uncertainty must be accounted for: the
b-jet tagging data/MC scale factor uncertainty (8.7%) and the error on the
integrated luminosity measurement (5.9%).

7.3 Results 

Using the integrated luminosity, trigger and kinematic efficiencies, and related 
systematic uncertainties described above, we derive: 

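$$\begin{aligned}
\sigma_Z \times B(Z \to b\bar{b}) &= 1578 \pm 123\,(\mathrm{stat})\;{}^{+624}_{-391}\,(\mathrm{syst})\ \mathrm{pb} \\
&= 1578\;{}^{+636}_{-410}\,(\mathrm{total})\ \mathrm{pb}
\end{aligned}$$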
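
The asymmetric uncertainties on this result can be cross-checked from the entries of Table 5. The following is a minimal sketch, assuming (this is not stated explicitly in the text) that the acceptance and background terms are uncorrelated and add in quadrature, with the positive and negative sides of the background-model term treated separately. The function and variable names are illustrative.

```python
import math


def quadrature(*terms):
    """Combine uncorrelated uncertainties by adding them in quadrature."""
    return math.sqrt(sum(t * t for t in terms))


CENTRAL_PB = 1578.0  # measured sigma_Z x B(Z -> bb), in pb
STAT_PB = 123.0      # statistical uncertainty, in pb

# Relative uncertainties in percent, as listed in Table 5.
acceptance = 17.3
bg_shape, bg_norm, iteration = 8.3, 4.2, 1.6
bg_model_up, bg_model_down = 34.3, 15.0

# Assumption: all terms are uncorrelated and combine in quadrature.
syst_up_pb = CENTRAL_PB * quadrature(acceptance, bg_shape, bg_norm, iteration, bg_model_up) / 100.0
syst_down_pb = CENTRAL_PB * quadrature(acceptance, bg_shape, bg_norm, iteration, bg_model_down) / 100.0

total_up_pb = quadrature(STAT_PB, syst_up_pb)
total_down_pb = quadrature(STAT_PB, syst_down_pb)

print(f"syst:  +{syst_up_pb:.0f} / -{syst_down_pb:.0f} pb")
print(f"total: +{total_up_pb:.0f} / -{total_down_pb:.0f} pb")
```

Under these assumptions the total comes out at about +636 and -410 pb, and the lower value agrees with the 410 pb quoted in the conclusions. The choice of background model dominates the upward uncertainty.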

The measured cross section is higher than, but consistent within the
uncertainties with, the NLO theoretical calculation [TT| combined with the
measured $Z \to b\bar{b}$ branching ratio [22], which predicts
$\sigma_Z \times B(Z \to b\bar{b}) = 1129 \pm 22$ pb.
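
As a rough illustration of the level of agreement, one can compare the difference between measurement and prediction with the combined uncertainty. The estimate below is a simple sketch that assumes Gaussian, uncorrelated uncertainties and uses the lower total uncertainty of 410 pb quoted in the conclusions, since the measurement lies above the prediction:

```latex
\frac{1578 - 1129}{\sqrt{410^2 + 22^2}} \approx \frac{449}{411} \approx 1.1
```

The measurement therefore lies about one standard deviation above the prediction.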

While this measurement has no appreciable impact on our knowledge of the
production and decay mechanisms of the Z boson in hadronic collisions, it does
fill a gap in the picture of measurements of Standard Model production
processes of vector bosons. If future searches for new physics at the Tevatron
and at the LHC, especially those for the Higgs boson and supersymmetric
particles, prove successful, final states with b-quark jets will be very
important to study and measure. The current measurement and the described
methodology provide a normalization point for the measurement of the
production rate of new $b\bar{b}$ resonances, or for setting a limit on their
cross section.



8 Conclusions and perspectives 

We have shown in this paper how a sizable sample of Z boson decays to b-quark
pairs has been extracted from proton-antiproton collisions provided by the
Tevatron collider. We have also described in detail the method we used to
obtain a precise measurement of the b-jet energy scale from the shape of the
dijet mass distribution of the selected data.

The measurement can be used to reduce the dominant systematic uncertainty in
many of the CDF analyses which determine the top quark mass, provided that
they use a jet definition similar to ours. Precise knowledge of the b-jet
energy scale also benefits analyses attempting to reconstruct new resonances
decaying to b-quark jets, such as a low-mass Standard Model Higgs boson.

The signal has also been used to measure the cross section for Z boson
production using the $b\bar{b}$ final state: the result is
$\sigma_Z \times B(Z \to b\bar{b}) = 1578^{+636}_{-410}$ pb.

Figure captions

Figure 1

Invariant mass of the charged tracks in the vertex for a sample of jets in single- 
tagged events (top) and for the tagged jet in double-tagged events (bottom). 
The templates show the relative fraction of b, c, and light quark or gluon jets 
estimated from the sample composition fit. 

Figure 2

Jet $E_T$ distributions for the two leading jets in experimental data. Left:
corrected $E_T$ of the leading jet for all events (continuous line) and events
passing the preliminary selection (dashed line). Right: same, for the second
jet.

Figure 3

Distributions (normalized to unity) of the variables used to optimize the
kinematical selection, for data and Monte Carlo (Pythia). Left: azimuthal
angle between the leading jets; right: corrected $E_T$ of the third jet.

Figure 4

Left: $P_T$ distribution of the Z boson for electron data and MC. Right:
corrected $E_T$ distribution of the leading jet for electron data and MC. All
distributions are normalized to unity.

Figure 5

Left: $\eta$ distribution of the leading jet for electron data and MC. Right:
$\Delta\phi$ distribution of the electron pair. All distributions are
normalized to unity.

Figure 6

Example of a data-driven background dijet mass distribution fit with a Pearson 
IV function plus an error function. 

Figure 7

Signal dijet mass distribution fitted with a sum of three Gaussian functions
(for k = 1.0).

Figure 8

Mean fitted b-JES factor as a function of the input scale factor. 

Figure 9

Results of the constrained unbinned likelihood fit performed on double-tagged
dijet data (points). A Gaussian constraint of 4630 ± 681 on the fitted number
of signal events is applied. The data-driven background shape and Monte Carlo
signal p.d.f. are shown (hatched functions). The fit returns 5621 ± 436 signal
events and a b-JES of 0.974 ± 0.011. The inset on the upper right shows the
data minus the background distribution (points) and the signal shape
normalized to the fitted number of signal events.

Selection level                              Events

Total analyzed events                        39 147 479
Pass jet $E_T$ cuts                          23 950 515
Pass jet $|\eta_d|$ cuts                     21 420 308
Both jets taggable                           18 128 488
One tagged jet                                6 205 578
Two tagged jets                                 699 590
Pass $\Delta\phi_{12}$ and $E_T^3$ cuts         267 246

Table 1

Statistics of the analyzed data at different levels of selection.



Source of uncertainty                     Relative uncertainty

Statistical                               1.0%
Luminosity                                5.9%
Data/MC b-tagging SF                      8.7%
Z_BB trigger simulation (calorimeter)     1.8%
Z_BB trigger simulation (tracks)          8.9%
ISR uncertainty                           3.8%
Modeling of FSR                           2.9%

Total uncertainty                         14.7%

Table 2

Uncertainties on the expected number of signal events.



Systematic source           b-JES factor

Background choice           +0.012 -0.006
Background statistics       0.011
Background correction       0.005
Monte Carlo template        0.003
Monte Carlo statistics      0.002

Total                       +0.017 -0.014

Table 3

Summary of systematic uncertainties on the b-jet energy scale. The total
uncertainty is obtained by adding the individual contributions in quadrature.

Source                      b-JES factor

Monte Carlo ISR             0.004
Monte Carlo FSR             0.012
Monte Carlo PDFs            0.005

Table 4

Other sources of uncertainty that are not included in the b-jet energy scale
measurement.



Systematic source            Method                               Value

Kinematical uncertainty
  JES                        $\pm 1\sigma_c$ jet corrections      1.6%
  ISR                        $Z \to \ell^+\ell^-$ data/MC         3.8%
  FSR                        MC                                   2.9%
  PDF                        MC reweighting                       7.3%
  MC gen.                    Pythia vs Herwig                     5.4%
  Trigger                    Low $E_T$ QCD data                   9.1%
  Relative uncertainty                                            13.8%

b-tagging efficiency
  Tagging eff. (two tags)    data/MC scale factor                 0.903 ± 0.079
  Relative uncertainty                                            8.7%

Luminosity
  Total luminosity                                                584.0 ± 34.5 pb$^{-1}$
  Relative uncertainty                                            5.9%

Total signal acceptance uncertainty
  Relative uncertainty                                            17.3%

Background systematics
  BG shape modeling          Pseudo-experiments                   8.3%
  BG normalization           Pseudo-experiments                   4.2%
  BG model choice            Sideband fit                         +34.3% -15.0%
  Iteration procedure        Fit to data                          1.6%

Table 5

Summary of all sources of systematic uncertainty on the
$\sigma_Z \times B(Z \to b\bar{b})$ measurement.